- 
                Notifications
    You must be signed in to change notification settings 
- Fork 15k
[mlir][emitc] Lower SCF using memrefs #93371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The SCF-to-EmitC lowering of yielded values currently generates
emitc.{variable,assign} ops implemeting the yield logic via memory. This
patch replaces these ops with memref.{alloc, load, store} ops in order
to separate the usage of memory from its concrete modeling in EmitC,
which is left for the memref-to-emitc conversion pass.
Since EmitC currently doesn't support lowering zero-rank memref types,
this patch uses 1D memrefs which translate to C arrays.
    | @llvm/pr-subscribers-mlir-emitc @llvm/pr-subscribers-mlir Author: Gil Rapaport (aniragil) ChangesThe SCF-to-EmitC lowering of yielded values currently generates Patch is 31.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93371.diff 6 Files Affected: 
 diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td
index e6d678dc1b12b..18d7abe2d707c 100644
--- a/mlir/include/mlir/Conversion/Passes.td
+++ b/mlir/include/mlir/Conversion/Passes.td
@@ -976,7 +976,7 @@ def ConvertParallelLoopToGpu : Pass<"convert-parallel-loops-to-gpu"> {
 def SCFToEmitC : Pass<"convert-scf-to-emitc"> {
   let summary = "Convert SCF dialect to EmitC dialect, maintaining structured"
                 " control flow";
-  let dependentDialects = ["emitc::EmitCDialect"];
+  let dependentDialects = ["emitc::EmitCDialect", "memref::MemRefDialect"];
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Conversion/SCFToEmitC/SCFToEmitC.cpp b/mlir/lib/Conversion/SCFToEmitC/SCFToEmitC.cpp
index 367142a520742..26fe8ba20c36f 100644
--- a/mlir/lib/Conversion/SCFToEmitC/SCFToEmitC.cpp
+++ b/mlir/lib/Conversion/SCFToEmitC/SCFToEmitC.cpp
@@ -14,11 +14,13 @@
 
 #include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/Dialect/EmitC/IR/EmitC.h"
+#include "mlir/Dialect/MemRef/IR/MemRef.h"
 #include "mlir/Dialect/SCF/IR/SCF.h"
 #include "mlir/IR/Builders.h"
 #include "mlir/IR/BuiltinOps.h"
 #include "mlir/IR/IRMapping.h"
 #include "mlir/IR/MLIRContext.h"
+#include "mlir/IR/OpDefinition.h"
 #include "mlir/IR/PatternMatch.h"
 #include "mlir/Transforms/DialectConversion.h"
 #include "mlir/Transforms/Passes.h"
@@ -63,21 +65,41 @@ static SmallVector<Value> createVariablesForResults(T op,
 
   for (OpResult result : op.getResults()) {
     Type resultType = result.getType();
-    emitc::OpaqueAttr noInit = emitc::OpaqueAttr::get(context, "");
-    emitc::VariableOp var =
-        rewriter.create<emitc::VariableOp>(loc, resultType, noInit);
+    SmallVector<OpFoldResult> dimensions = {rewriter.getIndexAttr(1)};
+    memref::AllocaOp var =
+        rewriter.create<memref::AllocaOp>(loc, dimensions, resultType);
     resultVariables.push_back(var);
   }
 
   return resultVariables;
 }
 
-// Create a series of assign ops assigning given values to given variables at
+// Create a series of load ops reading the values of given variables at
+// the current insertion point of given rewriter.
+static SmallVector<Value> readValues(SmallVector<Value> &variables,
+                                     PatternRewriter &rewriter, Location loc) {
+  Value zero;
+  if (variables.size() > 0)
+    zero = rewriter.create<arith::ConstantOp>(loc, rewriter.getIndexAttr(0));
+  SmallVector<Value> values;
+  SmallVector<Value> indices = {zero};
+  for (Value var : variables)
+    values.push_back(
+        rewriter.create<memref::LoadOp>(loc, var, indices).getResult());
+  return values;
+}
+
+// Create a series of store ops assigning given values to given variables at
 // the current insertion point of given rewriter.
 static void assignValues(ValueRange values, SmallVector<Value> &variables,
                          PatternRewriter &rewriter, Location loc) {
-  for (auto [value, var] : llvm::zip(values, variables))
-    rewriter.create<emitc::AssignOp>(loc, var, value);
+  Value zero;
+  if (variables.size() > 0)
+    zero = rewriter.create<arith::ConstantOp>(loc, rewriter.getIndexAttr(0));
+  for (auto [value, var] : llvm::zip(values, variables)) {
+    SmallVector<Value> indices = {zero};
+    rewriter.create<memref::StoreOp>(loc, value, var, indices);
+  }
 }
 
 static void lowerYield(SmallVector<Value> &resultVariables,
@@ -100,8 +122,6 @@ LogicalResult ForLowering::matchAndRewrite(ForOp forOp,
 
   // Create an emitc::variable op for each result. These variables will be
   // assigned to by emitc::assign ops within the loop body.
-  SmallVector<Value> resultVariables =
-      createVariablesForResults(forOp, rewriter);
   SmallVector<Value> iterArgsVariables =
       createVariablesForResults(forOp, rewriter);
 
@@ -115,18 +135,25 @@ LogicalResult ForLowering::matchAndRewrite(ForOp forOp,
   // Erase the auto-generated terminator for the lowered for op.
   rewriter.eraseOp(loweredBody->getTerminator());
 
+  IRRewriter::InsertPoint ip = rewriter.saveInsertionPoint();
+  rewriter.setInsertionPointToEnd(loweredBody);
+  SmallVector<Value> iterArgsValues =
+      readValues(iterArgsVariables, rewriter, loc);
+  rewriter.restoreInsertionPoint(ip);
+
   SmallVector<Value> replacingValues;
   replacingValues.push_back(loweredFor.getInductionVar());
-  replacingValues.append(iterArgsVariables.begin(), iterArgsVariables.end());
+  replacingValues.append(iterArgsValues.begin(), iterArgsValues.end());
 
   rewriter.mergeBlocks(forOp.getBody(), loweredBody, replacingValues);
   lowerYield(iterArgsVariables, rewriter,
              cast<scf::YieldOp>(loweredBody->getTerminator()));
 
   // Copy iterArgs into results after the for loop.
-  assignValues(iterArgsVariables, resultVariables, rewriter, loc);
+  SmallVector<Value> resultValues =
+      readValues(iterArgsVariables, rewriter, loc);
 
-  rewriter.replaceOp(forOp, resultVariables);
+  rewriter.replaceOp(forOp, resultValues);
   return success();
 }
 
@@ -169,6 +196,7 @@ LogicalResult IfLowering::matchAndRewrite(IfOp ifOp,
 
   auto loweredIf =
       rewriter.create<emitc::IfOp>(loc, ifOp.getCondition(), false, false);
+  SmallVector<Value> resultValues = readValues(resultVariables, rewriter, loc);
 
   Region &loweredThenRegion = loweredIf.getThenRegion();
   lowerRegion(thenRegion, loweredThenRegion);
@@ -178,7 +206,7 @@ LogicalResult IfLowering::matchAndRewrite(IfOp ifOp,
     lowerRegion(elseRegion, loweredElseRegion);
   }
 
-  rewriter.replaceOp(ifOp, resultVariables);
+  rewriter.replaceOp(ifOp, resultValues);
   return success();
 }
 
diff --git a/mlir/test/Conversion/SCFToEmitC/for.mlir b/mlir/test/Conversion/SCFToEmitC/for.mlir
index 7f90310af2189..04d3b75e97a65 100644
--- a/mlir/test/Conversion/SCFToEmitC/for.mlir
+++ b/mlir/test/Conversion/SCFToEmitC/for.mlir
@@ -47,20 +47,24 @@ func.func @for_yield(%arg0 : index, %arg1 : index, %arg2 : index) -> (f32, f32)
 // CHECK-SAME:      %[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: index) -> (f32, f32) {
 // CHECK-NEXT:    %[[VAL_3:.*]] = arith.constant 0.000000e+00 : f32
 // CHECK-NEXT:    %[[VAL_4:.*]] = arith.constant 1.000000e+00 : f32
-// CHECK-NEXT:    %[[VAL_5:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-// CHECK-NEXT:    %[[VAL_6:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-// CHECK-NEXT:    %[[VAL_7:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-// CHECK-NEXT:    %[[VAL_8:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-// CHECK-NEXT:    emitc.assign %[[VAL_3]] : f32 to %[[VAL_7]] : f32
-// CHECK-NEXT:    emitc.assign %[[VAL_4]] : f32 to %[[VAL_8]] : f32
-// CHECK-NEXT:    emitc.for %[[VAL_9:.*]] = %[[VAL_0]] to %[[VAL_1]] step %[[VAL_2]] {
-// CHECK-NEXT:      %[[VAL_10:.*]] = arith.addf %[[VAL_7]], %[[VAL_8]] : f32
-// CHECK-NEXT:      emitc.assign %[[VAL_10]] : f32 to %[[VAL_7]] : f32
-// CHECK-NEXT:      emitc.assign %[[VAL_10]] : f32 to %[[VAL_8]] : f32
+// CHECK-NEXT:    %[[VAL_5:.*]] = memref.alloca() : memref<1xf32>
+// CHECK-NEXT:    %[[VAL_6:.*]] = memref.alloca() : memref<1xf32>
+// CHECK-NEXT:    %[[VAL_7:.*]] = arith.constant 0 : index
+// CHECK-NEXT:    memref.store %[[VAL_3]], %[[VAL_5]]{{\[}}%[[VAL_7]]] : memref<1xf32>
+// CHECK-NEXT:    memref.store %[[VAL_4]], %[[VAL_6]]{{\[}}%[[VAL_7]]] : memref<1xf32>
+// CHECK-NEXT:    emitc.for %[[VAL_8:.*]] = %[[VAL_0]] to %[[VAL_1]] step %[[VAL_2]] {
+// CHECK-NEXT:      %[[VAL_9:.*]] = arith.constant 0 : index
+// CHECK-NEXT:      %[[VAL_10:.*]] = memref.load %[[VAL_5]]{{\[}}%[[VAL_9]]] : memref<1xf32>
+// CHECK-NEXT:      %[[VAL_11:.*]] = memref.load %[[VAL_6]]{{\[}}%[[VAL_9]]] : memref<1xf32>
+// CHECK-NEXT:      %[[VAL_12:.*]] = arith.addf %[[VAL_10]], %[[VAL_11]] : f32
+// CHECK-NEXT:      %[[VAL_13:.*]] = arith.constant 0 : index
+// CHECK-NEXT:      memref.store %[[VAL_12]], %[[VAL_5]]{{\[}}%[[VAL_13]]] : memref<1xf32>
+// CHECK-NEXT:      memref.store %[[VAL_12]], %[[VAL_6]]{{\[}}%[[VAL_13]]] : memref<1xf32>
 // CHECK-NEXT:    }
-// CHECK-NEXT:    emitc.assign %[[VAL_7]] : f32 to %[[VAL_5]] : f32
-// CHECK-NEXT:    emitc.assign %[[VAL_8]] : f32 to %[[VAL_6]] : f32
-// CHECK-NEXT:    return %[[VAL_5]], %[[VAL_6]] : f32, f32
+// CHECK-NEXT:    %[[VAL_14:.*]] = arith.constant 0 : index
+// CHECK-NEXT:    %[[VAL_15:.*]] = memref.load %[[VAL_5]]{{\[}}%[[VAL_14]]] : memref<1xf32>
+// CHECK-NEXT:    %[[VAL_16:.*]] = memref.load %[[VAL_6]]{{\[}}%[[VAL_14]]] : memref<1xf32>
+// CHECK-NEXT:    return %[[VAL_15]], %[[VAL_16]] : f32, f32
 // CHECK-NEXT:  }
 
 func.func @nested_for_yield(%arg0 : index, %arg1 : index, %arg2 : index) -> f32 {
@@ -77,20 +81,28 @@ func.func @nested_for_yield(%arg0 : index, %arg1 : index, %arg2 : index) -> f32
 // CHECK-LABEL: func.func @nested_for_yield(
 // CHECK-SAME:      %[[VAL_0:.*]]: index, %[[VAL_1:.*]]: index, %[[VAL_2:.*]]: index) -> f32 {
 // CHECK-NEXT:    %[[VAL_3:.*]] = arith.constant 1.000000e+00 : f32
-// CHECK-NEXT:    %[[VAL_4:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-// CHECK-NEXT:    %[[VAL_5:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-// CHECK-NEXT:    emitc.assign %[[VAL_3]] : f32 to %[[VAL_5]] : f32
+// CHECK-NEXT:    %[[VAL_4:.*]] = memref.alloca() : memref<1xf32>
+// CHECK-NEXT:    %[[VAL_5:.*]] = arith.constant 0 : index
+// CHECK-NEXT:    memref.store %[[VAL_3]], %[[VAL_4]]{{\[}}%[[VAL_5]]] : memref<1xf32>
 // CHECK-NEXT:    emitc.for %[[VAL_6:.*]] = %[[VAL_0]] to %[[VAL_1]] step %[[VAL_2]] {
-// CHECK-NEXT:      %[[VAL_7:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-// CHECK-NEXT:      %[[VAL_8:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-// CHECK-NEXT:      emitc.assign %[[VAL_5]] : f32 to %[[VAL_8]] : f32
-// CHECK-NEXT:      emitc.for %[[VAL_9:.*]] = %[[VAL_0]] to %[[VAL_1]] step %[[VAL_2]] {
-// CHECK-NEXT:        %[[VAL_10:.*]] = arith.addf %[[VAL_8]], %[[VAL_8]] : f32
-// CHECK-NEXT:        emitc.assign %[[VAL_10]] : f32 to %[[VAL_8]] : f32
+// CHECK-NEXT:      %[[VAL_7:.*]] = arith.constant 0 : index
+// CHECK-NEXT:      %[[VAL_8:.*]] = memref.load %[[VAL_4]]{{\[}}%[[VAL_7]]] : memref<1xf32>
+// CHECK-NEXT:      %[[VAL_9:.*]] = memref.alloca() : memref<1xf32>
+// CHECK-NEXT:      %[[VAL_10:.*]] = arith.constant 0 : index
+// CHECK-NEXT:      memref.store %[[VAL_8]], %[[VAL_9]]{{\[}}%[[VAL_10]]] : memref<1xf32>
+// CHECK-NEXT:      emitc.for %[[VAL_11:.*]] = %[[VAL_0]] to %[[VAL_1]] step %[[VAL_2]] {
+// CHECK-NEXT:        %[[VAL_12:.*]] = arith.constant 0 : index
+// CHECK-NEXT:        %[[VAL_13:.*]] = memref.load %[[VAL_9]]{{\[}}%[[VAL_12]]] : memref<1xf32>
+// CHECK-NEXT:        %[[VAL_14:.*]] = arith.addf %[[VAL_13]], %[[VAL_13]] : f32
+// CHECK-NEXT:        %[[VAL_15:.*]] = arith.constant 0 : index
+// CHECK-NEXT:        memref.store %[[VAL_14]], %[[VAL_9]]{{\[}}%[[VAL_15]]] : memref<1xf32>
 // CHECK-NEXT:      }
-// CHECK-NEXT:      emitc.assign %[[VAL_8]] : f32 to %[[VAL_7]] : f32
-// CHECK-NEXT:      emitc.assign %[[VAL_7]] : f32 to %[[VAL_5]] : f32
+// CHECK-NEXT:      %[[VAL_16:.*]] = arith.constant 0 : index
+// CHECK-NEXT:      %[[VAL_17:.*]] = memref.load %[[VAL_9]]{{\[}}%[[VAL_16]]] : memref<1xf32>
+// CHECK-NEXT:      %[[VAL_18:.*]] = arith.constant 0 : index
+// CHECK-NEXT:      memref.store %[[VAL_17]], %[[VAL_4]]{{\[}}%[[VAL_18]]] : memref<1xf32>
 // CHECK-NEXT:    }
-// CHECK-NEXT:    emitc.assign %[[VAL_5]] : f32 to %[[VAL_4]] : f32
-// CHECK-NEXT:    return %[[VAL_4]] : f32
+// CHECK-NEXT:    %[[VAL_19:.*]] = arith.constant 0 : index
+// CHECK-NEXT:    %[[VAL_20:.*]] = memref.load %[[VAL_4]]{{\[}}%[[VAL_19]]] : memref<1xf32>
+// CHECK-NEXT:    return %[[VAL_20]] : f32
 // CHECK-NEXT:  }
diff --git a/mlir/test/Conversion/SCFToEmitC/if.mlir b/mlir/test/Conversion/SCFToEmitC/if.mlir
index afc9abc761eb4..3d394f06f541f 100644
--- a/mlir/test/Conversion/SCFToEmitC/if.mlir
+++ b/mlir/test/Conversion/SCFToEmitC/if.mlir
@@ -53,18 +53,23 @@ func.func @test_if_yield(%arg0: i1, %arg1: f32) {
 // CHECK-SAME:                           %[[VAL_0:.*]]: i1,
 // CHECK-SAME:                           %[[VAL_1:.*]]: f32) {
 // CHECK-NEXT:    %[[VAL_2:.*]] = arith.constant 0 : i8
-// CHECK-NEXT:    %[[VAL_3:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> i32
-// CHECK-NEXT:    %[[VAL_4:.*]] = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f64
+// CHECK-NEXT:    %[[VAL_3:.*]] = memref.alloca() : memref<1xi32>
+// CHECK-NEXT:    %[[VAL_4:.*]] = memref.alloca() : memref<1xf64>
 // CHECK-NEXT:    emitc.if %[[VAL_0]] {
 // CHECK-NEXT:      %[[VAL_5:.*]] = emitc.call_opaque "func_true_1"(%[[VAL_1]]) : (f32) -> i32
 // CHECK-NEXT:      %[[VAL_6:.*]] = emitc.call_opaque "func_true_2"(%[[VAL_1]]) : (f32) -> f64
-// CHECK-NEXT:      emitc.assign %[[VAL_5]] : i32 to %[[VAL_3]] : i32
-// CHECK-NEXT:      emitc.assign %[[VAL_6]] : f64 to %[[VAL_4]] : f64
+// CHECK-NEXT:      %[[VAL_7:.*]] = arith.constant 0 : index
+// CHECK-NEXT:      memref.store %[[VAL_5]], %[[VAL_3]]{{\[}}%[[VAL_7]]] : memref<1xi32>
+// CHECK-NEXT:      memref.store %[[VAL_6]], %[[VAL_4]]{{\[}}%[[VAL_7]]] : memref<1xf64>
 // CHECK-NEXT:    } else {
-// CHECK-NEXT:      %[[VAL_7:.*]] = emitc.call_opaque "func_false_1"(%[[VAL_1]]) : (f32) -> i32
-// CHECK-NEXT:      %[[VAL_8:.*]] = emitc.call_opaque "func_false_2"(%[[VAL_1]]) : (f32) -> f64
-// CHECK-NEXT:      emitc.assign %[[VAL_7]] : i32 to %[[VAL_3]] : i32
-// CHECK-NEXT:      emitc.assign %[[VAL_8]] : f64 to %[[VAL_4]] : f64
+// CHECK-NEXT:      %[[VAL_8:.*]] = emitc.call_opaque "func_false_1"(%[[VAL_1]]) : (f32) -> i32
+// CHECK-NEXT:      %[[VAL_9:.*]] = emitc.call_opaque "func_false_2"(%[[VAL_1]]) : (f32) -> f64
+// CHECK-NEXT:      %[[VAL_10:.*]] = arith.constant 0 : index
+// CHECK-NEXT:      memref.store %[[VAL_8]], %[[VAL_3]]{{\[}}%[[VAL_10]]] : memref<1xi32>
+// CHECK-NEXT:      memref.store %[[VAL_9]], %[[VAL_4]]{{\[}}%[[VAL_10]]] : memref<1xf64>
 // CHECK-NEXT:    }
+// CHECK-NEXT:    %[[VAL_11:.*]] = arith.constant 0 : index
+// CHECK-NEXT:    %[[VAL_12:.*]] = memref.load %[[VAL_3]]{{\[}}%[[VAL_11]]] : memref<1xi32>
+// CHECK-NEXT:    %[[VAL_13:.*]] = memref.load %[[VAL_4]]{{\[}}%[[VAL_11]]] : memref<1xf64>
 // CHECK-NEXT:    return
 // CHECK-NEXT:  }
diff --git a/mlir/test/Target/Cpp/for.mlir b/mlir/test/Target/Cpp/for.mlir
index 60988bcb46556..dc45853a0be04 100644
--- a/mlir/test/Target/Cpp/for.mlir
+++ b/mlir/test/Target/Cpp/for.mlir
@@ -32,85 +32,103 @@ func.func @test_for(%arg0 : index, %arg1 : index, %arg2 : index) {
 // CPP-DECLTOP-NEXT: }
 // CPP-DECLTOP-NEXT: return;
 
-func.func @test_for_yield() {
-  %start = "emitc.constant"() <{value = 0 : index}> : () -> index
-  %stop = "emitc.constant"() <{value = 10 : index}> : () -> index
-  %step = "emitc.constant"() <{value = 1 : index}> : () -> index
-
-  %s0 = "emitc.constant"() <{value = 0 : i32}> : () -> i32
-  %p0 = "emitc.constant"() <{value = 1.0 : f32}> : () -> f32
-
-  %0 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> i32
-  %1 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-  %2 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> i32
-  %3 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
-  emitc.assign %s0 : i32 to %2 : i32
-  emitc.assign %p0 : f32 to %3 : f32
-  emitc.for %iter = %start to %stop step %step {
-    %sn = emitc.call_opaque "add"(%2, %iter) : (i32, index) -> i32
-    %pn = emitc.call_opaque "mul"(%3, %iter) : (f32, index) -> f32
-    emitc.assign %sn : i32 to %2 : i32
-    emitc.assign %pn : f32 to %3 : f32
-    emitc.yield
+func.func @test_for_yield(%arg0: index, %arg1: index, %arg2: index) {
+  %0 = "emitc.constant"() <{value = 0.000000e+00 : f32}> : () -> f32
+  %1 = "emitc.constant"() <{value = 1.000000e+00 : f32}> : () -> f32
+  %2 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> !emitc.array<1xf32>
+  %3 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> !emitc.array<1xf32>
+  %4 = "emitc.constant"() <{value = 0 : index}> : () -> index
+  %5 = emitc.subscript %2[%4] : (!emitc.array<1xf32>, index) -> f32
+  emitc.assign %0 : f32 to %5 : f32
+  %6 = emitc.subscript %3[%4] : (!emitc.array<1xf32>, index) -> f32
+  emitc.assign %1 : f32 to %6 : f32
+  emitc.for %arg3 = %arg0 to %arg1 step %arg2 {
+    %12 = "emitc.constant"() <{value = 0 : index}> : () -> index
+    %13 = emitc.subscript %2[%12] : (!emitc.array<1xf32>, index) -> f32
+    %14 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
+    emitc.assign %13 : f32 to %14 : f32
+    %15 = emitc.subscript %3[%12] : (!emitc.array<1xf32>, index) -> f32
+    %16 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
+    emitc.assign %15 : f32 to %16 : f32
+    %17 = emitc.add %14, %16 : (f32, f32) -> f32
+    %18 = "emitc.constant"() <{value = 0 : index}> : () -> index
+    %19 = emitc.subscript %2[%18] : (!emitc.array<1xf32>, index) -> f32
+    emitc.assign %17 : f32 to %19 : f32
+    %20 = emitc.subscript %3[%18] : (!emitc.array<1xf32>, index) -> f32
+    emitc.assign %17 : f32 to %20 : f32
   }
-  emitc.assign %2 : i32 to %0 : i32
-  emitc.assign %3 : f32 to %1 : f32
-
+  %7 = "emitc.constant"() <{value = 0 : index}> : () -> index
+  %8 = emitc.subscript %2[%7] : (!emitc.array<1xf32>, index) -> f32
+  %9 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
+  emitc.assign %8 : f32 to %9 : f32
+  %10 = emitc.subscript %3[%7] : (!emitc.array<1xf32>, index) -> f32
+  %11 = "emitc.variable"() <{value = #emitc.opaque<"">}> : () -> f32
+  emitc.assign %10 : f32 to %11 : f32
   return
 }
-// CPP-DEFAULT: void test_for_yield() {
-// CPP-DEFAULT-NEXT: size_t [[START:[^ ]*]] = 0;
-// CPP-DEFAULT-NEXT: size_t [[STOP:[^ ]*]] = 10;
-// CPP-DEFAULT-NEXT: size_t [[STEP:[^ ]*]] = 1;
-// CPP-DEFAULT-NEXT: int32_t [[S0:[^ ]*]] = 0;
-// CPP-DEFAULT-NEXT: float [[P0:[^ ]*]] = 1.000000000e+00f;
-// CPP-DEFAULT-NEXT: int32_t [[SE:[^ ]*]];
-// CPP-DEFAULT-NEXT: float [[PE:[^ ]*]];
-// CPP-DEFAULT-NEXT: int32_t [[SI:[^ ]*]];
-// CPP-DEFAULT-NEXT: float [[PI:[^ ]*]];
-// CPP-DEFAULT-NEXT: [[SI:[^ ]*]] = [[S0]];
-// CPP-DEFAULT-NEXT: [[PI:[^ ]*]] = [[P0]];
-// CPP-DEFAULT-NEXT: for (size_t [[ITER:[^ ]*]] = [[START]]; [[ITER]] < [[STOP]]; [[ITER]] += [[STEP]]) {
-// CPP-DEFAULT-NEXT: int32_t [[SN:[^ ]*]] = add([[SI]], [[ITER]]);
-// CPP-DEFAULT-NEXT: float [[PN:[^ ]*]] = mul([[PI]], [[ITER]]);
-// CPP-DEFAULT-NEXT: [[SI]] = [[SN]];
-// CPP-DEFAULT-NEXT: [[PI]] = [[PN]];
+// CPP-DEFAULT: void test_for_yield(size_t v1, size_t v2, size_t v3) {
+// CPP-DEFAULT-NEXT: float [[V4:[^ ]*]] = 0.0e+00f;
+// CPP-DEFAULT-NEXT: float [[V5:[^ ]*]] = 1.000000000e+00f;
+// CPP-DEFAULT-NEXT: float [[V6:[^ ]*]][1];
+// CPP-DEFAULT-NEXT: float [[V7:[^ ]*]][1];
+// CPP-DEFAULT-NEXT: size_t [[V8:[^ ]*]] = 0;
+// CPP-DEFAULT-NEXT: [[V6]][[[V8]]] = [[V4]];
+// CPP-DEFAULT-NEXT: [[V7]][[[V8]]] = [[V5]];
+// CPP-DEFAULT-NEXT: for (size_t [[V9:[^ ]*]] = [[V1]]; [[V9]] < [[V2]]; [[V9]] += [[V3]]) {
+// CPP-DEFAULT-NEXT:   size_t [[V10:[^ ]*]] = 0;
+// CPP-DEFAULT-NEXT:   float [[V11:[^ ]*]];
+// CPP-DEFAULT-NEXT:   [[V11]] = [[V6]][[[V10]]];
+// CPP-DEFAULT-NEXT:   float [[V12:[^ ]*]];
+// CPP-DEFAULT-NEXT:   [[V12]] = [[V7]][[[V10]]];
+// CPP-DEFAULT-NEXT:   float [[V13:[^ ]*]] = [[V11]] + [[V12]];
+// CPP-DEFAULT-NEXT:   size_t [[V14:[^ ]*]] = 0;
+// CPP-DEFAULT-NEXT:   [[V6]][[[V14]]] = [[V13]];
+// CPP-DEFAULT-NEXT:   [[V7]][[[V14]]] = [[V13]];
 // CPP-DEFAULT-NEXT: }
-// CPP-DEFAULT-NEXT: [[SE]] = [[SI]];
-// CPP-DEFAULT-NEXT: [[PE]] = [[PI]];
+// CPP-DEFAULT-NEXT: size_t [[V15:[^ ]*]] = 0;
+// CPP-DEFAULT-NEXT: float [[V16:[^ ]*]];
+// CPP-DEFAULT-NEXT: v16 = v6[v15];
+// CPP-DEFAULT-NEXT: float [[V17:[^ ]*]];
+// CPP-DEFAULT-NEXT: v17 = v7[v15];
 // CPP-DEFAULT-NEXT: return;
 
-// CPP-DECLTOP: void test_for_yield() {
-// CPP-DECLTOP-NEXT: size_t [[START:[^ ]*]];
-// CPP-DECLTOP-NEXT: size_t [[STOP:[^ ]*]];
-// CPP-DECLTOP-NEXT: size_t [[STEP:[^ ]*]];
-// CPP-DECLTOP-NEXT: int32_t [[S0:[^ ]*]];
-// CPP-DECLTOP-NEXT: float [[P0:[^ ]*]];
-// CPP-DECLTOP-NEXT: int32_t [[SE:[^ ]*]];
-// CPP-DECLTOP-NEXT: float [[PE:[^ ]*]];
-// CPP-DECLTOP-NEXT: int32_t [[SI:[^ ]*]];
-// CPP-DECLTOP-NEXT: float [[PI:[^ ]*]];
-// CPP-DECLTOP-NEXT: int32_t [[SN:[^ ]*]];
-// CPP-DECLTOP-NEXT: float [[PN:[^ ]*]];
-// CPP-DECLTOP-NEXT: [[START]] = 0;
-// CPP-DECLTOP-NEXT: [[STOP]] = 10;
-// CPP-D...
[truncated]
 | 
| I'm not so clear about the motivation of this change. From what I understand, SCF lowered correctly to emitc before. | 
| 
 I was about to ask the same question. Are there any benefits? | 
| 
 Correct, this change is not functional. It's a refactoring trying to separate the concerns of (1) lowering value-carrying control-flow ops to emitc through memory and (2) lowering memory ops to emitc. In fact, we can do (1) within the scf dialect and have  
 This PR promotes scalars to 1-element arrays, which indeed leads to less natural C/C++ code. Getting back to natural C/C++ code can be achieved by either: | 
| 
 Ok, now I'm better seeing the whole picture. Let me try to summarize my current understanding: 
 As alternative to 7), you have shown in #92684 how we could do mem2reg as part of the memref-to-emitc lowering to turn memrefs into scalars in emitc. That gives natural C/C++ code but fails to generally convert all rank 0 memrefs - because turning rank 0 memref function arguments into scalars looses their reference semantics. Two things occur to me: And that the emitter needs to understand that  b) Looking more closely at this PR, I don't see the lowering code benefit from unifying handling of memory semantics into the memref dialect. The code complexity is essentially the same, trading op names like  | 
| 
 I'm not sure about the implications of that change, as function arguments are lvalues by design. (In the sense that you are allowed to take the address of a function argument). So I don't know if we need to distinguish lvalues and lvalue references. 
 I would expect that the changes to lvalue variables look about equivalent to the changes here (after adopting the semantics to what we agreed on in #91475). Side note: If I'm reading the code correctly this also unifies iter_args and results to one memref each instead of two. We could apply this change to the current lowering as well, right? | 
| 
 Right. 
 Not exactly: #92684 aims to map rank-0 memrefs (mlir's representation for scalar variables) to  
 
 This sounds like correct lowering when emitting C++. For C, which doesn't support references, lowering could be to an  
 Not sure I follow. As any SSA value in mlir, func.func's arguments and return values are SSA values, not memory locations, e.g. func.func @for_yield(%arg0 : index, %arg1 : index, %arg2 : index) -> (f32, f32) {
  %s0 = arith.constant 0.0 : f32
  %s1 = arith.constant 1.0 : f32
  %result:2 = scf.for %i0 = %arg0 to %arg1 step %arg2 iter_args(%si = %s0, %sj = %s1) -> (f32, f32) {
    %sn = arith.addf %si, %sj : f32
    scf.yield %sn, %sn : f32, f32
  }
  return %result#0, %result#1 : f32, f32
}When lowered to C/C++, these arguments should behave as in-function SSA values in emitc, i.e. be lowered to (const) C arguments whose address cannot be taken. Am I missing something? 
 As you state in (5), emitc's memory ops will usually be the result of lowering memref ops (the standard core dialect for modeling memory), with  If we're at (a), changes in emitc's memory modeling must now affect both  
 I think so, yes. It seems redundant to copy the variables again. | 
| 
 Up for review as #94898 | 
The SCF-to-EmitC lowering of yielded values currently generates
emitc.{variable,assign} ops implemeting the yield logic via memory. This
patch replaces these ops with memref.{alloc, load, store} ops in order
to separate the usage of memory from its concrete modeling in EmitC,
which is left for the memref-to-emitc conversion pass.
Since EmitC currently doesn't support lowering zero-rank memref types,
this patch uses 1D memrefs which translate to C arrays.